Java Example: Converting Word Documents to HTML [doc and docx Formats]
- 2021-07-01 07:38:23
- OfStack
This article gives an example of converting Word documents to HTML in Java, covering both the doc and docx formats. It is shared for your reference, as follows:
The calls to CaseHtm.docx and CaseHtm.dox suggest that the methods below live in a class named CaseHtm. With the library versions used in this article, they need the following imports:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.core.FileURIResolver;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.w3c.dom.Document;
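public static void main(String[] args) throws Exception {
    // Directory that holds the Word files; the generated .htm files are written here too
    String filePath = "C:/Users/Administrator/Desktop/92 Diagnosis and treatment programs and clinical pathways /";
    File[] files = new File(filePath).listFiles();
    if (files == null) {
        // listFiles() returns null when the path is not a readable directory
        throw new IllegalArgumentException("Not a readable directory: " + filePath);
    }
    for (File file2 : files) {
        String fileName = file2.getName();
        int dot = fileName.lastIndexOf('.');
        if (!file2.isFile() || dot < 0) {
            continue; // skip subdirectories and files without an extension
        }
        String name = fileName.substring(0, dot);
        String lowerName = fileName.toLowerCase();
        System.out.println(fileName);
        if (lowerName.endsWith(".docx")) {
            CaseHtm.docx(filePath, fileName, name + ".htm");
        } else if (lowerName.endsWith(".doc")) {
            // Only .doc files go to the HWPF converter; other files (including generated .htm) are ignored
            CaseHtm.dox(filePath, fileName, name + ".htm");
        }
    }
}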
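The file-name handling in the loop above can be tried out on its own, without any Word files. The following is only a sketch: the class WordFileNames, the Kind enum and the helpers kindOf and htmlNameFor are hypothetical names introduced for illustration and are not part of the original code.

```java
import java.util.Locale;

public class WordFileNames {
    /** Kinds of input the batch loop needs to tell apart (hypothetical helper type). */
    enum Kind { DOCX, DOC, OTHER }

    /** Classifies a file name by its extension, ignoring case. */
    static Kind kindOf(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".docx")) {
            return Kind.DOCX;
        }
        if (lower.endsWith(".doc")) {
            return Kind.DOC;
        }
        return Kind.OTHER;
    }

    /** Replaces the extension with ".htm"; a name without an extension just gets ".htm" appended. */
    static String htmlNameFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        return base + ".htm";
    }

    public static void main(String[] args) {
        String[] samples = {"plan.docx", "REPORT.DOC", "notes.txt", "README"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + kindOf(sample) + ", " + htmlNameFor(sample));
        }
    }
}
```

Classifying the name first makes it easy to skip anything that is neither doc nor docx, such as the .htm files produced by an earlier run in the same directory.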
/**
 * Converts a docx file to HTML.
 * @param filePath directory containing the source file, with a trailing separator; the HTML file and extracted images are written here
 * @param fileName name of the docx file
 * @param htmlName name of the HTML file to create
 * @throws Exception if the file cannot be read or converted
 */
public static void docx(String filePath, String fileName, String htmlName) throws Exception {
    final String file = filePath + fileName;
    // try-with-resources closes the streams even if the conversion fails
    try (InputStream in = new FileInputStream(new File(file))) {
        // 1) Load the Word document into an XWPFDocument object
        XWPFDocument document = new XWPFDocument(in);
        // 2) Configure the XHTML options (the URI resolver and extractor set the directory where images are stored)
        File imageFolderFile = new File(filePath);
        XHTMLOptions options = XHTMLOptions.create().URIResolver(new FileURIResolver(imageFolderFile));
        options.setExtractor(new FileImageExtractor(imageFolderFile));
        options.setIgnoreStylesIfUnused(false);
        options.setFragment(true);
        // 3) Convert the XWPFDocument to XHTML
        try (OutputStream out = new FileOutputStream(new File(filePath + htmlName))) {
            XHTMLConverter.getInstance().convert(document, out, options);
        }
    }
}
/**
 * Converts a doc file to HTML.
 * @param filePath directory containing the source file, with a trailing separator; the HTML file is written here
 * @param fileName name of the doc file
 * @param htmlName name of the HTML file to create
 * @throws Exception if the file cannot be read or converted
 */
public static void dox(String filePath, String fileName, String htmlName) throws Exception {
    final String file = filePath + fileName;
    WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
    // Parse the Word document into the converter's DOM tree; the input stream is closed afterwards
    try (InputStream input = new FileInputStream(new File(file))) {
        HWPFDocument wordDocument = new HWPFDocument(input);
        wordToHtmlConverter.processDocument(wordDocument);
    }
    Document htmlDocument = wordToHtmlConverter.getDocument();
    // Serialize the DOM tree as indented UTF-8 HTML; the output stream is closed even if this fails
    try (OutputStream outStream = new FileOutputStream(new File(filePath + htmlName))) {
        DOMSource domSource = new DOMSource(htmlDocument);
        StreamResult streamResult = new StreamResult(outStream);
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer serializer = factory.newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        serializer.setOutputProperty(OutputKeys.METHOD, "html");
        serializer.transform(domSource, streamResult);
    }
}
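The last step of the doc conversion, writing a DOM tree out as HTML with a Transformer, uses only the JDK and can be tried in isolation. The sketch below is an illustration under assumptions: the class DomToHtmlDemo, its methods, and the hand-built sample document are hypothetical stand-ins for the tree that WordToHtmlConverter produces; only the three output properties are taken from the code above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToHtmlDemo {
    /** Serializes a DOM tree to an HTML string, using the same output properties as the doc conversion. */
    static String toHtml(Document doc) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        serializer.setOutputProperty(OutputKeys.METHOD, "html");
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Builds a tiny hand-made document that stands in for the converter's output. */
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element paragraph = doc.createElement("p");
        paragraph.setTextContent("Hello from a DOM tree");
        body.appendChild(paragraph);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toHtml(sampleDocument()));
    }
}
```

Writing to a ByteArrayOutputStream instead of a FileOutputStream keeps the sketch free of file-system side effects; swapping in a file stream gives the behaviour used in the conversion method.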
The Maven dependencies used are as follows:
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.xdocreport.document</artifactId>
    <version>1.0.5</version>
</dependency>
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
    <version>1.0.5</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.12</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.12</version>
</dependency>
I hope this article helps with your Java programming.